Impact of Regressand Stratification in Dataset Shift Caused by Cross-Validation

Authors

Abstract

Data that have not been modeled cannot be correctly predicted. Under this assumption, this research studies how k-fold cross-validation can introduce dataset shift in regression problems. This shift causes the data distributions in the training and test sets to differ and, therefore, degrades the estimation of model performance. Even though stratification of the output variable is widely used in the field of classification to reduce the impacts of dataset shift induced by cross-validation, its use in regression is not widespread in the literature. This paper analyzes the consequences for dataset shift of including different regressand stratification schemes in cross-validation with regression data. The results obtained show that these schemes allow creating more similar training and test sets, reducing the presence of dataset shift related to cross-validation. The bias and deviation of the performance estimation obtained by regression algorithms are improved using the highest amounts of strata, as is the number of cross-validation repetitions necessary to obtain these better results.
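The abstract does not spell out the stratification schemes it compares, so the following is only a minimal sketch of one common approach, assuming equal-frequency strata: samples are ranked by the regressand, cut into `n_strata` contiguous groups, and each group is dealt at random across the `k` folds. The function name `stratified_regression_folds` and its parameters are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple


def stratified_regression_folds(
    y: Sequence[float], k: int = 5, n_strata: int = 10, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs, stratified on y.

    Assumption: strata are equal-frequency bins of the ranked target values.
    """
    n = len(y)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and len(y)")
    n_strata = max(1, min(n_strata, n))
    rng = random.Random(seed)

    # Rank the samples by target value.
    order = sorted(range(n), key=lambda i: y[i])

    folds: List[List[int]] = [[] for _ in range(k)]
    next_fold = 0
    for s in range(n_strata):
        # Each stratum is a contiguous slice of the ranked samples.
        stratum = order[s * n // n_strata:(s + 1) * n // n_strata]
        rng.shuffle(stratum)
        # Deal the shuffled stratum to the folds in round-robin order,
        # continuing from the previous stratum so fold sizes stay balanced.
        for idx in stratum:
            folds[next_fold].append(idx)
            next_fold = (next_fold + 1) % k

    splits = []
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        splits.append((train, test))
    return splits
```

With `n_strata=1` the sketch reduces to ordinary shuffled k-fold cross-validation, while larger values of `n_strata` force every fold to cover the whole range of the target more evenly, which is the kind of effect the abstract attributes to using more strata.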


Similar resources

Covariate Shift Adaptation by Importance Weighted Cross Validation

A common assumption in supervised learning is that the input points in the training set follow the same probability distribution as the input points that will be given in the future test phase. However, this assumption is not satisfied, for example, when the outside of the training region is extrapolated. The situation where the training input points and test input points follow different distr...


The impact of e-readiness on EC success in public sector in Iran

acknowledge the importance of e-commerce to their countries and to survival of their businesses and in creating and encouraging an atmosphere for the wide adoption and success of e-commerce in the long term. the investment for implementing e-commerce in the public sector is one of the areas which is focused in government‘s action plan for cross-disciplinary it development and e-readiness in go...

Importance-Weighted Cross-Validation for Covariate Shift

A common assumption in supervised learning is that the input points in the training set follow the same probability distribution that the input points used for testing follow. However, this assumption is not satisfied, for example, when the outside of training region is inter/extrapolated. The situation where the training input points and test input points follow different distributions is call...


Customer Validation in Cross-Dock

Considering the importance of validation of customers in the cross-dock and since this is one of the problems of implementing cross-dock system in Iran, this study attempted to extract customer validation criteria. The purpose of the research is to eliminate the distrust of distributors in receiving the funds of the sent items and the statistical sample of this research is the experts of the sy...


Unsupervised stratification of cross-validation for accuracy estimation

The rapid development of new learning algorithms increases the need for improved accuracy estimation methods. Moreover, methods allowing the comparison of several different learning algorithms are important for the performance evaluation of new ones. In this paper we propose new accuracy estimation methods which are extensions of the k-fold cross-validation method. The methods proposed construc...



Journal

Journal title: Mathematics

Year: 2022

ISSN: 2227-7390

DOI: https://doi.org/10.3390/math10142538